Formant Estimation and Tracking Using Deep Learning

Authors

  • Yehoshua Dissen
  • Joseph Keshet
Abstract

Formant frequency estimation and tracking are among the most fundamental problems in speech processing. In the estimation task, the input is a stationary speech segment, such as the middle part of a vowel, and the goal is to estimate the formant frequencies. In the tracking task, the input is a series of speech frames, and the goal is to track the trajectory of the formant frequencies throughout the signal. Traditionally, formant estimation and tracking are done using ad-hoc signal processing methods. In this paper we propose using machine learning techniques, trained on an annotated corpus of read speech, for these tasks. Our feature set is composed of LPC-based cepstral coefficients computed with a range of model orders, together with pitch-synchronous cepstral coefficients. Two deep network architectures are used as learning algorithms: a deep feed-forward network for the estimation task and a recurrent neural network for the tracking task. The performance of our methods compares favorably with mainstream LPC-based implementations and state-of-the-art tracking algorithms.
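To make the "LPC-based cepstral coefficients with a range of model orders" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the autocorrelation method with the Levinson-Durbin recursion for the LPC fit and the standard recursion from all-pole coefficients to cepstral coefficients. The function names, the model orders (8, 10, 12), the number of cepstral coefficients, and the synthetic test frame are all illustrative assumptions; the abstract does not specify them. Windowing and the pitch-synchronous features are not covered.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] so that
    x[t] is approximated by sum_k a[k] * x[t - k]."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        refl = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = refl
        for k in range(1, i):
            new_a[k] = a[k] - refl * a[i - k]
        a = new_a
        err *= 1.0 - refl * refl
    return a[1:], err


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model
    1 / (1 - sum_k a[k] z^-k), with a given as a 0-indexed list."""
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


def multi_order_lpc_cepstra(frame, orders=(8, 10, 12), n_ceps=12):
    """Concatenate LPC cepstra computed at several model orders.
    The orders and n_ceps are hypothetical defaults."""
    r = autocorrelation(frame, max(orders))
    feats = []
    for p in orders:
        a, _ = levinson_durbin(r, p)
        feats.extend(lpc_to_cepstrum(a, n_ceps))
    return feats


# Synthetic "vowel-like" frame: two sinusoids plus a little seeded noise
# (purely illustrative; not data from the paper).
fs = 8000
rng = random.Random(0)
frame = [math.sin(2 * math.pi * 700 * t / fs)
         + 0.5 * math.sin(2 * math.pi * 1200 * t / fs)
         + 0.1 * rng.gauss(0.0, 1.0)
         for t in range(240)]
features = multi_order_lpc_cepstra(frame)
print(len(features))
```

A vector like `features` would be one frame's input to a network; computing the cepstra at several orders gives the learner several smoothed views of the same spectral envelope instead of committing to a single LPC order.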

Similar resources

Domain Adaptation For Formant Estimation Using Deep Learning

In this paper we present a domain adaptation technique for formant estimation using a deep network. We first train a deep learning network on a small read speech dataset. We then freeze the parameters of the trained network and use several different datasets to train an adaptation layer that makes the obtained network universal in the sense that it works well for a variety of speakers and speec...

A Formant Tracking LP Model for Speech Processing in Car/Train Noise

Formant estimation becomes complicated in the presence of correlated background noise such as car and train noise, because the spectrum of noise from revolving mechanical sources has its own spectral peaks that affect the number and positions of the observed peaks in the noisy speech spectrum. This paper investigates the modeling and estimation of spectral parameters at formants of noisy speech in the...

Formant tracking linear prediction model using HMMs and Kalman filters for noisy speech processing

This paper presents a formant tracking linear prediction (LP) model for speech processing in noise. The main focus of this work is on the utilization of the correlation of the energy contours of speech, along the formant tracks, for improved formant and LP model estimation in noise. The approach proposed in this paper provides a systematic framework for modelling and utilization of the inter-fr...

A formant tracking LP model for speech processing

This paper investigates the modeling and estimation of spectral parameters at formants of noisy speech in the presence of car and train noise. Formant estimation using two-dimensional hidden Markov models (2D-HMM) is reviewed and employed to study the influence of noise on observations of formants. The first set of experimental results presented show the influence of car and train noise on the d...

Formant Tracking Linear Prediction Model using HMMs for Noisy Speech Processing

This paper presents a formant-tracking linear prediction (FTLP) model for speech processing in noise. The main focus of this work is the detection of formant trajectories based on hidden Markov models (HMMs), for improved formant estimation in noise. The approach proposed in this paper provides a systematic framework for modelling and utilization of a time sequence of peaks which satisfies continui...

Publication date: 2016